Overview of Stemming Algorithms for Indian and Non-Indian Languages
Abstract
Stemming is a pre-processing step in text mining applications and a common requirement of natural language processing functions. Stemming is the process of reducing inflected words to their stem. Its main purpose is to reduce the different grammatical forms of a word (noun, adjective, verb, adverb, and so on) to its root form. Stemming is widely used in information retrieval systems and reduces the size of index files. The goal of stemming is thus to reduce inflectional forms, and sometimes derivationally related forms, of a word to a common base form. In this paper we discuss different stemming algorithms for non-Indian and Indian languages, methods of stemming, and their accuracy and errors.
Keywords: over-stemming, under-stemming, rule-based stemming.
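To make the terms in the abstract concrete, here is a minimal sketch of a rule-based (suffix-stripping) stemmer in Python. The suffix list, the minimum stem length, and the function names `stem` and `build_index` are illustrative assumptions, not rules taken from the paper or from any published stemmer.

```python
# Toy rule-based stemmer: strip the first matching suffix.
# SUFFIXES and MIN_STEM_LEN are hypothetical values chosen for illustration.
SUFFIXES = ("ing", "ed", "es", "s")
MIN_STEM_LEN = 3


def stem(word: str) -> str:
    """Return the word with one known suffix removed, if the stem stays long enough."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word


def build_index(tokens):
    """Group surface forms under their stem, as an index would."""
    index = {}
    for token in tokens:
        index.setdefault(stem(token), set()).add(token)
    return index


# Three inflected forms collapse to one index entry.
index = build_index(["connected", "connecting", "connects"])

# Under-stemming: "connection" is related but is not reduced to "connect".
under = stem("connection")

# Over-stemming: "news" and "new" are unrelated but share a stem.
over = stem("news") == stem("new")
```

With these assumed rules, the three forms of "connect" share a single index key, which is how stemming shrinks index files. The same rules also show both error types named in the keywords: "connection" stays separate (under-stemming), while "news" is wrongly merged with "new" (over-stemming).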
Similar resources
A Comprehensive Analyze of Stemming Algorithms for Indian and Non-indian Languages
Stemming is a technique used for reducing inflected words to their stem or root form. This is applicable for both the suffix as well as prefix. Stemming is a preprocessing step in text mining application and commonly used for Natural Language Processing (NLP). A stemmer can execute operation of altering morphologically identical words to root word without performing morphological analysis of th...
Literature Review: Stemming Algorithms for Indian and Non-Indian Languages
I. Introduction Stemming plays an important role in Information Retrieval System (IRS) for improving the performance of all languages. The goal of stemming is to diminish inflectional and derivational variant forms of a word to a common base form. A stemmer can execute operation of transforming morphologically identical words to root word without performing morphological analysis of that term. ...
A Literature Review: Stemming Algorithms for Indian Languages
Stemming is the process of extracting root word from the given inflection word. It also plays significant role in numerous application of Natural Language Processing (NLP). The stemming problem has addressed in many contexts and by researchers in many disciplines. This expository paper presents survey of some of the latest developments on stemming algorithms in data mining and also presents wit...
Statistical Investigation and Comparative Assessment of the Non-Performing Assets of Indian Commercial Banks
Non-performing assets (NPAs) have been a major cause of concern for Indian commercial banks in the recent past years. Many studies have been reported on the different aspects of NPAs in Indian banking system. However, there is a crucial lack of investigation on the comparative assessment of various types of banks such as Public sector banks, Private sector banks and foreign banks so that the tr...
Shear Waves Through Non Planar Interface Between Anisotropic Inhomogeneous and Visco-Elastic Half-Spaces
A problem of reflection and transmission of a plane shear wave incident at a corrugated interface between transversely isotropic inhomogeneous and visco-elastic half-spaces is investigated. Applying appropriate boundary conditions and using Rayleigh’s method of approximation expressions for reflection and transmission coefficients are obtained for the first and second order approximation of the...
Journal title:
- CoRR
Volume abs/1404.2878, issue -
Pages -
Publication date 2014